
We show that the shadow vertex algorithm can be used to compute a short path between a given pair of vertices of a polytope $P = \{x \in \mathbb{R}^n : Ax \le b\}$ along the edges of $P$, where $A \in \mathbb{R}^{m \times n}$. Both the length of the path and the running time of the algorithm are polynomial in $m$, $n$, and a parameter $1/\delta$ that is a measure for the flatness of the vertices of $P$. For integer matrices $A \in \mathbb{Z}^{m \times n}$ we show a connection between $\delta$ and the largest absolute value $\Delta$ of any sub-determinant of $A$, yielding a bound of $O(\Delta^4 m n^4)$ for the length of the computed path. This bound is expressed in the same parameter $\Delta$ as the recent non-constructive bound of $O(\Delta^2 n^4 \log(n\Delta))$ by Bonifas et al. [1].

For the special case of totally unimodular matrices, the length of the computed path simplifies to $O(mn^4)$, which significantly improves the previously best known constructive bound of $O(m^{16} n^3 \log^3(mn))$ by Dyer and Frieze [7].






1 Introduction 

We consider the following problem: Given a matrix $A = [a_1, \ldots, a_m]^T \in \mathbb{R}^{m \times n}$, a vector $b \in \mathbb{R}^m$, and two vertices $x_1$ and $x_2$ of the polytope $P = \{x \in \mathbb{R}^n : Ax \le b\}$, find a short path from $x_1$ to $x_2$ along the edges of $P$ efficiently. In this context, efficient means that the running time of the algorithm is polynomially bounded in $m$, $n$, and the length of the path it computes. Note that the polytope $P$ does not have to be bounded.

The diameter $d(P)$ of the polytope $P$ is the smallest integer $d$ that bounds the length of the shortest path between any two vertices of $P$ from above. The polynomial Hirsch conjecture states that the diameter of $P$ is polynomially bounded in $m$ and $n$ for any matrix $A$ and any vector $b$. As long as this conjecture remains unresolved, it is unclear whether there always exists a path of polynomial length between the given vertices $x_1$ and $x_2$. Moreover, even if such a path exists, it is open whether there is an efficient algorithm to find it.

Related work The diameter of polytopes has been studied extensively in the last decades. In 1957 Hirsch conjectured that the diameter of $P$ is bounded by $m - n$ for any matrix $A$ and any vector $b$ (see Dantzig's seminal book about linear programming [6]). This conjecture has been disproven by Klee and Walkup [9], who gave an unbounded counterexample. However, it remained open for quite a long time whether the conjecture holds for bounded polytopes. More than forty years later Santos [12] gave the first counterexample to this refined conjecture, showing that there are bounded polytopes $P$ for which $d(P) \ge (1 + \varepsilon) \cdot m$ for some $\varepsilon > 0$. This is the best known lower bound today. On the other hand, the best known upper bound of $O(m^{1 + \log n})$ due to Kalai and Kleitman [5] is only quasi-polynomial. It is still an open question whether $d(P)$ is always polynomially bounded in $m$ and $n$. This has only been shown for special classes of polytopes like 0/1 polytopes, flow-polytopes, and the transportation polytope. For these classes of polytopes bounds of $m - n$ (Naddef [10]), $O(mn \log n)$ (Orlin [11]), and $O(m)$ (Brightwell et al. [3]) have been shown, respectively. On the other hand, there are bounds on the diameter of far more general classes of polytopes that depend polynomially on $m$, $n$, and on additional parameters. Recently, Bonifas et al. [1] showed that the diameter of polytopes $P$ defined by integer matrices $A$ is bounded by a polynomial in $n$ and a parameter that depends on the matrix $A$. They showed that $d(P) = O(\Delta^2 n^4 \log(n\Delta))$, where $\Delta$ is the largest absolute value of any sub-determinant of $A$. Although the parameter $\Delta$ can be very large in general, this approach yields bounds for classes of polytopes for which $\Delta$ is known to be small. For example, if the matrix $A$ is totally unimodular, i.e., if all sub-determinants of $A$ are from $\{-1, 0, 1\}$, then their bound simplifies to $O(n^4 \log n)$, improving the previously best known bound of $O(m^{16} n^3 \log^3(mn))$ by Dyer and Frieze [7].

We are not only interested in the existence of a short path between two vertices of a polytope 
but we want to compute such a path efficiently. It is clear that lower bounds for the diameter 
of polytopes have direct (negative) consequences for this algorithmic problem. However, upper 
bounds for the diameter do not necessarily have algorithmic consequences as they might be non-constructive. The aforementioned bounds of Orlin, Brightwell et al., and Dyer and Frieze are
constructive, whereas the bound of Bonifas et al. is not. 

Our contribution We give a constructive upper bound for the diameter of the polytope $P = \{x \in \mathbb{R}^n : Ax \le b\}$ for arbitrary matrices $A \in \mathbb{R}^{m \times n}$ and arbitrary vectors $b \in \mathbb{R}^m$.¹ This bound is polynomial in $m$, $n$, and a parameter $1/\delta$, which depends only on the matrix $A$ and is a measure for the angle between edges of the polytope $P$ and their neighboring facets. We say that a facet $F$ of the polytope $P$ is neighboring an edge $e$ if exactly one of the endpoints of $e$ belongs to $F$. The parameter $\delta$ denotes the smallest sine of any angle between an edge and a neighboring facet in $P$. If, for example, every edge is orthogonal to its neighboring facets, then $\delta = 1$. On the other hand, if there exists an edge that is almost parallel to a neighboring facet, then $\delta \approx 0$. The formal definition of $\delta$ is deferred to Section 5.

A well-known pivot rule for the simplex algorithm is the shadow vertex rule, which gained attention in recent years because it has been shown to have polynomial running time in the framework of smoothed analysis [13]. We will present a randomized variant of this pivot rule that computes a path between two given vertices of the polytope $P$. We will introduce this variant in Section 2 and we call it shadow vertex algorithm in the following.

Theorem 1. Given vertices $x_1$ and $x_2$ of $P$, the shadow vertex algorithm efficiently computes a path from $x_1$ to $x_2$ on the polytope $P$ with expected length $O\!\left(\frac{mn^2}{\delta^2}\right)$.

Let us emphasize that the algorithm is very simple and its running time depends only polynomially on $m$, $n$, and the length of the path it computes.

Theorem 1 does not resolve the polynomial Hirsch conjecture as the value $\delta$ can be exponentially small. Furthermore, it does not imply a good running time of the shadow vertex method for optimizing linear programs because for the variant considered in this paper both vertices have to be known. Contrary to this, in the optimization problem the objective is to determine the optimal vertex. To compare our results with the result by Bonifas et al. [1], we show that, if $A$ is an integer matrix, then $\frac{1}{\delta} \le n\Delta^2$, which yields the following corollary.

Corollary 2. Let $A \in \mathbb{Z}^{m \times n}$ be an integer matrix and let $b \in \mathbb{R}^m$ be a real-valued vector. Given vertices $x_1$ and $x_2$ of $P$, the shadow vertex algorithm efficiently computes a path from $x_1$ to $x_2$ on the polytope $P$ with expected length $O(\Delta^4 m n^4)$.
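The step from Theorem 1 to Corollary 2 is a direct substitution. The following short derivation spells it out, assuming that the expected path length in Theorem 1 is $O(mn^2/\delta^2)$, which is the form consistent with the $O(\Delta^4 m n^4)$ bound stated in the abstract:

```latex
\frac{1}{\delta} \le n\Delta^2
\quad\Longrightarrow\quad
O\!\left(\frac{mn^2}{\delta^2}\right)
\subseteq O\!\left(mn^2 \cdot \left(n\Delta^2\right)^2\right)
= O\!\left(\Delta^4 m n^4\right).
```

For a totally unimodular matrix, $\Delta = 1$ and the right-hand side reduces to $O(mn^4)$.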

1 Note that we do not require the polytope to be bounded. 



This bound is worse than the bound of Bonifas et al., but it is constructive. Furthermore, if $A$ is a totally unimodular matrix, then $\Delta = 1$. Hence, we obtain the following corollary.

Corollary 3. Let $A \in \mathbb{Z}^{m \times n}$ be a totally unimodular matrix and let $b \in \mathbb{R}^m$ be a vector. Given vertices $x_1$ and $x_2$ of $P$, the shadow vertex algorithm efficiently computes a path from $x_1$ to $x_2$ on the polytope $P$ with expected length $O(mn^4)$.

This is a significant improvement upon the previously best known constructive bound of $O(m^{16} n^3 \log^3(mn))$ due to Dyer and Frieze because we can assume $m \ge n$. Otherwise, $P$ does not have vertices and the problem is ill-posed.

Organization of the paper In Section 2 we describe the shadow vertex algorithm. In Section 4 we give an outline of our analysis and present the main ideas. After that, in Section 5 we introduce the parameter $\delta$ and discuss some of its properties. Section 6 is devoted to the proof of Theorem 1. The probabilistic foundations of our analysis are provided in Section 7.

2 The Shadow Vertex Algorithm 

Let us first introduce some notation. For an integer $n \in \mathbb{N}$ we denote by $[n]$ the set $\{1, \ldots, n\}$. Let $A \in \mathbb{R}^{m \times n}$ be an $m \times n$-matrix and let $i \in [m]$ and $j \in [n]$ be indices. With $A_{i,j}$ we refer to the $(m-1) \times (n-1)$-submatrix obtained from $A$ by removing the $i$-th row and the $j$-th column. We call the determinant of any $k \times k$-submatrix of $A$ a sub-determinant of $A$ of size $k$. By $I_n$ we denote the $n \times n$-identity matrix $\mathrm{diag}(1, \ldots, 1)$ and by $O_{m \times n}$ the $m \times n$-zero matrix. If $n \in \mathbb{N}$ is clear from the context, then we define vector $e_i$ to be the $i$-th column of $I_n$. For a vector $x \in \mathbb{R}^n$ we denote by $\|x\| = \|x\|_2$ the Euclidean norm of $x$ and by $\mathcal{N}(x) = \frac{1}{\|x\|} \cdot x$ for $x \ne 0$ the normalization of vector $x$.

2.1 Shadow Vertex Pivot Rule 

Our algorithm is inspired by the shadow vertex pivot rule for the simplex algorithm. Before describing our algorithm, we will briefly explain the geometric intuition behind this pivot rule. For a complete and more formal description, we refer the reader to [5] or [13]. Let us consider the linear program $\min c^T x$ subject to $x \in P$ for some vector $c \in \mathbb{R}^n$ and assume that an initial vertex $x_1$ of the polytope $P$ is known. For the sake of simplicity, we assume that there is a unique optimal vertex $x^*$ of $P$ that minimizes the objective function $c^T x$. The shadow vertex pivot rule first computes a vector $w \in \mathbb{R}^n$ such that the vertex $x_1$ minimizes the objective function $w^T x$ subject to $x \in P$. Again for the sake of simplicity, let us assume that the vectors $c$ and $w$ are linearly independent.

In the second step, the polytope $P$ is projected onto the plane spanned by the vectors $c$ and $w$. The resulting projection is a polygon $P'$ and one can show that the projections of both the initial vertex $x_1$ and the optimal vertex $x^*$ are vertices of this polygon. Additionally, every edge between two vertices $x$ and $y$ of $P'$ corresponds to an edge of $P$ between two vertices that are projected onto $x$ and $y$, respectively. Due to these properties a path from the projection of $x_1$ to the projection of $x^*$ along the edges of $P'$ corresponds to a path from $x_1$ to $x^*$ along the edges of $P$.

This way, the problem of finding a path from $x_1$ to $x^*$ on the polytope $P$ is reduced to finding a path between two vertices of a polygon. There are at most two such paths and the shadow vertex pivot rule chooses the one along which the objective $c^T x$ improves.

2.2 Our Algorithm 

As described in the introduction we consider the following problem: We are given a matrix $A = [a_1, \ldots, a_m]^T \in \mathbb{R}^{m \times n}$, a vector $b \in \mathbb{R}^m$, and two vertices $x_1, x_2$ of the polytope $P = \{x \in \mathbb{R}^n : Ax \le b\}$. Our objective is to find a short path from $x_1$ to $x_2$ along the edges of $P$.

We propose the following variant of the shadow vertex pivot rule to solve this problem: First choose two vectors $w_1, w_2 \in \mathbb{R}^n$ such that $x_1$ uniquely minimizes $w_1^T x$ subject to $x \in P$ and $x_2$ uniquely maximizes $w_2^T x$ subject to $x \in P$. Then project the polytope onto the plane spanned by $w_1$ and $w_2$ in order to obtain a polygon $P'$. Let us call the projection $\pi$. By the same arguments as for the shadow vertex pivot rule, it follows that $\pi(x_1)$ and $\pi(x_2)$ are vertices of $P'$ and that a path from $\pi(x_1)$ to $\pi(x_2)$ along the edges of $P'$ can be translated into a path from $x_1$ to $x_2$ along the edges of $P$. Hence, it suffices to compute such a path to solve the problem. Again, computing such a path is easy because $P'$ is a two-dimensional polygon.

The vectors $w_1$ and $w_2$ are not uniquely determined, but they can be chosen from cones that are determined by the vertices $x_1$ and $x_2$ and the polytope $P$. We choose $w_1$ and $w_2$ randomly from these cones. A more precise description of this algorithm is given as Algorithm 1.

Algorithm 1 Shadow Vertex Algorithm

1: Determine $n$ linearly independent rows $u_k^T$ of $A$ for which $u_k^T x_1 = b_k$.

2: Determine $n$ linearly independent rows $v_k^T$ of $A$ for which $v_k^T x_2 = b_k$.

3: Draw vectors $\lambda, \mu \in (0, 1]^n$ independently and uniformly at random.

4: Set $w_1 = -[\mathcal{N}(u_1), \ldots, \mathcal{N}(u_n)] \cdot \lambda$ and $w_2 = [\mathcal{N}(v_1), \ldots, \mathcal{N}(v_n)] \cdot \mu$.

5: Use the function $\pi : x \mapsto (w_1^T x, w_2^T x)$ to project $P$ onto the Euclidean plane and obtain the shadow vertex polygon $P' = \pi(P)$.

6: Walk from $\pi(x_1)$ along the edges of $P'$ in increasing direction of the second coordinate until $\pi(x_2)$ is found.

7: Output the corresponding path of $P$.
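Steps 3 to 5 of the algorithm can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `normalize`, `draw_objectives`, and `project` are hypothetical, the tight rows at $x_1$ and $x_2$ (Steps 1 and 2) are assumed to be given as lists of $n$ linearly independent vectors, and the walk along the polygon (Step 6) is not included.

```python
import math
import random


def normalize(v):
    """Return v scaled to Euclidean length 1 (the normalization N(v))."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def draw_objectives(tight_at_x1, tight_at_x2, rng):
    """Steps 3 and 4: draw lambda and mu from (0, 1]^n and build w1, w2.

    tight_at_x1 / tight_at_x2 are assumed to hold n linearly independent
    rows of A that are tight at x1 and x2, respectively.
    """
    n = len(tight_at_x1)
    # rng.random() lies in [0, 1), so 1 - rng.random() lies in (0, 1].
    lam = [1.0 - rng.random() for _ in range(n)]
    mu = [1.0 - rng.random() for _ in range(n)]
    us = [normalize(u) for u in tight_at_x1]
    vs = [normalize(v) for v in tight_at_x2]
    # w1 = -[N(u_1), ..., N(u_n)] * lambda and w2 = [N(v_1), ..., N(v_n)] * mu
    w1 = [-sum(lam[k] * us[k][j] for k in range(n)) for j in range(n)]
    w2 = [sum(mu[k] * vs[k][j] for k in range(n)) for j in range(n)]
    return w1, w2


def project(w1, w2, x):
    """Step 5: the projection pi(x) = (w1^T x, w2^T x)."""
    return (
        sum(a * b for a, b in zip(w1, x)),
        sum(a * b for a, b in zip(w2, x)),
    )
```

For the unit square with $x_1 = (0, 0)$ and $x_2 = (1, 1)$, calling `draw_objectives([(-1, 0), (0, -1)], [(1, 0), (0, 1)], random.Random(0))` yields vectors with strictly positive entries, so $\pi(x_1)$ has the smallest first coordinate and $\pi(x_2)$ the largest second coordinate among the projected vertices.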

Let us give some remarks about the algorithm above. The vectors $u_1, \ldots, u_n$ in Line 1 and the vectors $v_1, \ldots, v_n$ in Line 2 must exist because $x_1$ and $x_2$ are vertices of $P$. The only point where our algorithm makes use of randomness is in Line 3. By the choice of $w_1$ and $w_2$ in Line 4, $x_1$ is the unique optimum of the linear program $\min w_1^T x$ s.t. $x \in P$ and $x_2$ is the unique optimum of the linear program $\max w_2^T x$ s.t. $x \in P$. The former follows because for any $y \in P$ with $y \ne x_1$ there must be an index $k \in [n]$ with $u_k^T y < b_k$. The latter follows analogously. Note that $\|w_1\| \le \sum_{k=1}^n \lambda_k \cdot \|\mathcal{N}(u_k)\| = \sum_{k=1}^n \lambda_k \le n$ and, similarly, $\|w_2\| \le n$. The shadow vertex polygon $P'$ in Line 5 has several important properties: The projections of $x_1$ and $x_2$ are vertices of $P'$ and all edges of $P'$ correspond to projected edges of $P$. Hence, any path on the edges of $P'$ is the projection of a path on the edges of $P$. Though we call $P'$ a polygon, it does not have to be bounded. This is the case if $P$ is unbounded in the directions $w_1$ or $-w_2$. Nevertheless, there is always a path from $x_1$ to $x_2$, which will be found in Line 6. For more details about the shadow vertex pivot rule and formal proofs of these properties, we refer to the book of Borgwardt [2].

To give some intuition why these statements hold true, consider the projection depicted in Figure 1. We denote the first coordinate of the Euclidean plane by $\xi$ and the second coordinate by $\eta$. Since $w_1$ and $w_2$ are chosen such that $x_1$ and $x_2$ are, among the points of $P$, optimal for the functions $x \mapsto w_1^T x$ and $x \mapsto w_2^T x$, respectively, the projections $\pi(x_1)$ and $\pi(x_2)$ of $x_1$ and $x_2$ must be the leftmost vertex and the topmost vertex of $P' = \pi(P)$, respectively. As $P'$ is a (not necessarily bounded) polygon, this implies that if we start in vertex $\pi(x_1)$ and follow the edges of $P'$ in direction of increasing values of $\eta$, then we will end up in $\pi(x_2)$ after a finite number of steps. This is not only true if $P'$ is bounded (as depicted by the dotted line and the dark gray area) but also if $P'$ is unbounded (as depicted by the dashed lines and the dark gray plus the light gray area). Moreover, note that the slopes of the edges of the path from $\pi(x_1)$ to $\pi(x_2)$ are positive and monotonically decreasing.
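The walk from the leftmost to the topmost vertex can be illustrated on an explicit point set. The sketch below is not the pivoting procedure of the algorithm: it assumes that $P$ is bounded, that the images of all vertices of $P$ under $\pi$ are already available as a list of points, and that the leftmost and topmost points are unique. The helper names `shadow_path` and `slopes` are hypothetical. It builds the upper convex hull with the standard monotone chain technique and cuts it at the topmost point.

```python
def shadow_path(points):
    """Return the convex hull chain from the leftmost to the topmost point."""

    def cross(o, a, b):
        # Positive for a counter-clockwise turn o -> a -> b.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    upper = []
    for p in sorted(set(points)):
        # Keep only right turns, which yields the upper hull from left to right.
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) >= 0:
            upper.pop()
        upper.append(p)
    top = max(upper, key=lambda q: q[1])
    return upper[: upper.index(top) + 1]


def slopes(path):
    """Slopes of consecutive edges of the path."""
    return [(q[1] - p[1]) / (q[0] - p[0]) for p, q in zip(path, path[1:])]
```

For the points `[(0, 0), (1, 3), (3, 4), (4, 1), (2, 1), (1, 1)]` the path is `[(0, 0), (1, 3), (3, 4)]` with slopes `[3.0, 0.5]`, which are positive and decreasing as described above.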




Figure 1: Shadow polygon P' 



3 Degeneracy 



Any degenerate polytope $P$ can be made non-degenerate by perturbing the vector $b$ by a tiny amount of random noise. This way, another polytope $\tilde{P}$ is obtained that is non-degenerate with probability one. Any degenerate vertex of $P$ at which $\ell > n$ constraints are tight generates at most $\binom{\ell}{n}$ vertices of $\tilde{P}$ that are all very close to each other if the perturbation of $b$ is small. We say that two vertices of $\tilde{P}$ that correspond to the same vertex of $P$ are in the same equivalence class.

If the perturbation of the vector $b$ is small enough, then any edge between two vertices of $\tilde{P}$ in different equivalence classes corresponds to an edge in $P$ between the vertices that generated these equivalence classes. We apply the shadow vertex algorithm to the polytope $\tilde{P}$ to find a path $R$ between two arbitrary vertices from the equivalence classes generated by $x_1$ and $x_2$, respectively. Then we translate this path into a walk from $x_1$ to $x_2$ on the polytope $P$ by mapping each vertex on the path $R$ to the vertex that generated its equivalence class. This way, we obtain a walk from $x_1$ to $x_2$ on the polytope $P$ that may visit vertices multiple times and may also stay in the same vertex for some steps. In the latter type of steps only the algebraic representation of the current vertex is changed. As this walk on $P$ has the same length as the path that the shadow vertex algorithm computes on $\tilde{P}$, the upper bound we derive for the length of $R$ also applies to the degenerate polytope $P$.

Of course the perturbation of the vector $b$ might change the shape of the polytope $P$. In this context it is important to point out that the parameter $\delta$, which we define in the following, only depends on the matrix $A$ and, thus, is independent of the right-hand side $b$. Consequently, the parameter $\delta$ of the original polytope $P$ can also be used to describe the behavior of the shadow vertex simplex algorithm on the polytope $\tilde{P}$.

4 Outline of the Analysis 

In the remainder of this paper we assume that the polytope $P$ is non-degenerate, i.e., for each vertex $x$ of $P$ there are exactly $n$ indices $i$ for which $a_i^T x = b_i$. This implies that for any edge between two vertices $x$ and $y$ of $P$ there are exactly $n - 1$ indices $i$ for which $a_i^T x = a_i^T y = b_i$. According to Section 3 this assumption is justified.

From the description of the shadow vertex algorithm it is clear that the main step in proving Theorem 1 is to bound the expected number of edges on the path from $\pi(x_1)$ to $\pi(x_2)$ on the polygon $P'$. In order to do this, we look at the slopes of the edges on this path. As we discussed above, the sequence of slopes is monotonically decreasing. We will show that due to the randomness in the objective functions $w_1$ and $w_2$, it is even strictly decreasing with probability one. Furthermore, all slopes on this path are bounded from below by 0.

Instead of counting the edges on the path from $\pi(x_1)$ to $\pi(x_2)$ directly, we will count the number of different slopes in the interval $[0, 1]$ and we observe that the expected number of slopes from the interval $[0, \infty)$ is twice the expected number of slopes from the interval $[0, 1]$. In order to count the number of slopes in $[0, 1]$, we partition the interval $[0, 1]$ into several small subintervals and we bound for each of these subintervals $I$ the expected number of slopes in $I$. Then we use linearity of expectation to obtain an upper bound on the expected number of different slopes in $[0, 1]$, which directly translates into an upper bound on the expected number of edges on the path from $\pi(x_1)$ to $\pi(x_2)$.

We choose the subintervals so small that, with high probability, none of them contains more than one slope. Then, the expected number of slopes in a subinterval $I = (t, t + \varepsilon]$ is approximately equal to the probability that there is a slope in the interval $I$. In order to bound this probability, we use a technique reminiscent of the principle of deferred decisions that we have already used in [5]. The main idea is to split the random draw of the vectors $w_1$ and $w_2$ in the shadow vertex algorithm into two steps. The first step reveals enough information about the realizations of these vectors to determine the last edge $e = (\hat{p}, p^*)$ on the path from $\pi(x_1)$ to $\pi(x_2)$ whose slope is bigger than $t$ (see Figure 2). Even though $e$ is determined in the first step, its slope is not. We argue that there is still enough randomness left in the second step to bound the probability that the slope of $e$ lies in the interval $(t, t + \varepsilon]$ from above, yielding Theorem 1.

We will now give some more details on how the random draw of the vectors $w_1$ and $w_2$ is partitioned. Let $\hat{x}$ and $x^*$ be the vertices of the polytope $P$ that are projected onto $\hat{p}$ and $p^*$, respectively. Due to the non-degeneracy of the polytope $P$, there are exactly $n - 1$ constraints that are tight for both $\hat{x}$ and $x^*$ and there is a unique constraint $a_i^T x \le b_i$ that is tight for $x^*$ but not for $\hat{x}$. In the first step the vector $w_1$ is completely revealed while instead of $w_2$ only an element $\tilde{w}_2$ from the ray $\{w_2 + \gamma \cdot a_i : \gamma \ge 0\}$ is revealed. We then argue that knowing $w_1$ and $\tilde{w}_2$ suffices to identify the edge $e$. The only randomness left in the second step is the exact position of the vector $w_2$ on the ray $\{\tilde{w}_2 - \gamma \cdot a_i : \gamma \ge 0\}$, which suffices to bound the probability that the slope of $e$ lies in the interval $(t, t + \varepsilon]$.

Let us remark that the proof of Theorem 1 is inspired by the recent smoothed analysis of the successive shortest path algorithm for the minimum-cost flow problem [3]. Even though the general structure bears some similarity, the details of our analysis are much more involved.

5 The Parameter $\delta$

In this section we define the parameter $\delta$ that describes the flatness of the vertices of the polytope and state some relevant properties.

Definition 4.

1. Let $z_1, \ldots, z_n \in \mathbb{R}^n$ be linearly independent vectors and let $\varphi \in (0, \frac{\pi}{2}]$ be the angle between $z_n$ and the hyperplane $\mathrm{span}\{z_1, \ldots, z_{n-1}\}$. By $\hat{\delta}(\{z_1, \ldots, z_{n-1}\}, z_n) = \sin \varphi$ we denote the sine of angle $\varphi$. Moreover, we set $\delta(z_1, \ldots, z_n) = \min_{k \in [n]} \hat{\delta}(\{z_i : i \in [n] \setminus \{k\}\}, z_k)$.

2. Given a matrix $A = [a_1, \ldots, a_m]^T \in \mathbb{R}^{m \times n}$, we set

$$\delta(A) = \min \{\delta(a_{i_1}, \ldots, a_{i_n}) : a_{i_1}, \ldots, a_{i_n} \text{ linearly independent}\}.$$

The value $\hat{\delta}(\{z_1, \ldots, z_{n-1}\}, z_n)$ describes how orthogonal $z_n$ is to the span of $z_1, \ldots, z_{n-1}$. If $\varphi \approx 0$, i.e., $z_n$ is close to the span of $z_1, \ldots, z_{n-1}$, then $\hat{\delta}(\{z_1, \ldots, z_{n-1}\}, z_n) \approx 0$. On the other hand, if $z_n$ is orthogonal to $z_1, \ldots, z_{n-1}$, then $\varphi = \frac{\pi}{2}$ and, hence, $\hat{\delta}(\{z_1, \ldots, z_{n-1}\}, z_n) = 1$. The value $\hat{\delta}(\{z_1, \ldots, z_{n-1}\}, z_n)$ equals the distance between both faces of the parallelotope $Q$, given by $Q = \{\sum_{i=1}^n \alpha_i \cdot \mathcal{N}(z_i) : \alpha_i \in [0, 1]\}$, that are parallel to $\mathrm{span}\{z_1, \ldots, z_{n-1}\}$ and is scale invariant.

The value $\delta(z_1, \ldots, z_n)$ equals twice the inner radius $r_n$ of the parallelotope $Q$ and, thus, is a measure of the flatness of $Q$: A value $\delta(z_1, \ldots, z_n) \approx 0$ implies that $Q$ is nearly $(n-1)$-dimensional. On the other hand, if $\delta(z_1, \ldots, z_n) = 1$, then the vectors $z_1, \ldots, z_n$ are pairwise orthogonal, that is, $Q$ is an $n$-dimensional unit cube.
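To make the definition concrete, the following Python sketch computes the sine of the angle between a vector and the span of the others, and then takes the minimum over all choices of the distinguished vector. It is an illustration only: the helper names `sine_to_span` and `delta_of_vectors` are hypothetical, the input vectors are assumed to be linearly independent, and Gram-Schmidt orthogonalization is used here in place of any method the analysis itself relies on.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _remove_components(v, basis):
    """Subtract from v its components along the orthonormal basis vectors."""
    r = list(v)
    for e in basis:
        c = _dot(r, e)
        r = [ri - c * ei for ri, ei in zip(r, e)]
    return r


def sine_to_span(others, z):
    """Sine of the angle between z and span(others)."""
    basis = []
    for v in others:
        # Gram-Schmidt step: orthogonalize v against the basis so far.
        r = _remove_components(v, basis)
        norm = math.sqrt(_dot(r, r))
        basis.append([ri / norm for ri in r])
    residual = _remove_components(z, basis)
    return math.sqrt(_dot(residual, residual)) / math.sqrt(_dot(z, z))


def delta_of_vectors(zs):
    """delta(z_1, ..., z_n): minimum sine over the choice of z_k."""
    return min(
        sine_to_span(zs[:k] + zs[k + 1:], zs[k]) for k in range(len(zs))
    )
```

For the orthogonal pair `[[1, 0], [0, 1]]` the value is 1, while for `[[1, 0], [1, 1]]` it is the sine of 45 degrees, roughly 0.7071.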

The next lemma lists some useful statements concerning the parameter $\delta := \delta(A)$ including a connection to the parameters $\Delta_1$, $\Delta_{n-1}$, and $\Delta$ introduced in the paper of Bonifas et al. [1].

Lemma 5. Let $z_1, \ldots, z_n \in \mathbb{R}^n$ be linearly independent vectors, let $A \in \mathbb{R}^{m \times n}$ be a matrix, let $b \in \mathbb{R}^m$ be a vector, and let $\delta = \delta(A)$. Then, the following claims hold true:

1. If $M$ is the inverse of $[\mathcal{N}(z_1), \ldots, \mathcal{N}(z_n)]^T$, then

$$\delta(z_1, \ldots, z_n) = \frac{1}{\max_{k \in [n]} \|m_k\|} \le \frac{\sqrt{n}}{\max_{k \in [n]} \|M_k\|},$$

where $[m_1, \ldots, m_n] = M$ and $[M_1, \ldots, M_n] = M^T$.

2. If $Q \in \mathbb{R}^{n \times n}$ is an orthogonal matrix, then $\delta(Qz_1, \ldots, Qz_n) = \delta(z_1, \ldots, z_n)$.

3. Let $y_1$ and $y_2$ be two neighboring vertices of $P = \{x \in \mathbb{R}^n : Ax \le b\}$ and let $a_i^T$ be a row of $A$. If $a_i^T \cdot (y_2 - y_1) \ne 0$, then $|a_i^T \cdot (y_2 - y_1)| \ge \delta \cdot \|y_2 - y_1\|$.

4. If $A$ is an integral matrix, then $\frac{1}{\delta} \le n\Delta_1\Delta_{n-1} \le n\Delta^2$, where $\Delta$, $\Delta_1$, and $\Delta_{n-1}$ are the largest absolute values of any sub-determinant of $A$ of arbitrary size, of size 1, and of size $n - 1$, respectively.

Proof. First of all we derive a simple formula for $\hat{\delta}(\{z_1, \ldots, z_{n-1}\}, z_n)$. For this, assume that the vectors $z_1, \ldots, z_n$ are normalized. Now consider a normal vector $x \ne 0$ of $\mathrm{span}\{z_1, \ldots, z_{n-1}\}$ that lies in the same halfspace as $z_n$. Let $\varphi \in (0, \frac{\pi}{2}]$ be the angle between $z_n$ and $\mathrm{span}\{z_1, \ldots, z_{n-1}\}$ and let $\psi \in [0, \frac{\pi}{2})$ be the angle between $z_n$ and $x$. Clearly, $\varphi + \psi = \frac{\pi}{2}$. Consequently,

$$\hat{\delta}(\{z_1, \ldots, z_{n-1}\}, z_n) = \sin \varphi = \sin\left(\frac{\pi}{2} - \psi\right) = \cos \psi = \frac{z_n^T x}{\|x\|}.$$

The last fraction is invariant under scaling of $x$. Since $x$ and $z_n$ lie in the same halfspace, w.l.o.g. we can assume that $z_n^T x = 1$. Hence, $\hat{\delta}(\{z_1, \ldots, z_{n-1}\}, z_n) = \frac{1}{\|x\|}$, where $x$ is the unique solution of the equation $[z_1, \ldots, z_{n-1}, z_n]^T \cdot x = (0, \ldots, 0, 1)^T = e_n$. If the vectors $z_1, \ldots, z_n$ are not normalized, then we obtain

$$\hat{\delta}(\{z_1, \ldots, z_{n-1}\}, z_n) = \hat{\delta}(\{\mathcal{N}(z_1), \ldots, \mathcal{N}(z_{n-1})\}, \mathcal{N}(z_n)) = \frac{1}{\|x\|},$$

where $x = [\mathcal{N}(z_1), \ldots, \mathcal{N}(z_{n-1}), \mathcal{N}(z_n)]^{-T} \cdot e_n$. Since for the previous line of reasoning we can relabel the vectors $z_1, \ldots, z_n$ arbitrarily, this implies

$$\delta(z_1, \ldots, z_n) = \min_{k \in [n]} \frac{1}{\|[\mathcal{N}(z_1), \ldots, \mathcal{N}(z_n)]^{-T} \cdot e_k\|} = \frac{1}{\max \{\|x\| : x \text{ is column of } [\mathcal{N}(z_1), \ldots, \mathcal{N}(z_n)]^{-T}\}}.$$

This yields the equation in Claim 1. Due to

$$\left(\max_{k \in [n]} \|M_k\|\right)^2 \le \sum_{k=1}^n \|M_k\|^2 = \sum_{k=1}^n \|m_k\|^2 \le n \cdot \left(\max_{k \in [n]} \|m_k\|\right)^2$$

we obtain the inequality $\frac{1}{\max_{k \in [n]} \|m_k\|} \le \frac{\sqrt{n}}{\max_{k \in [n]} \|M_k\|}$ stated in Claim 1.
For Claim 2 observe that

$$[\mathcal{N}(Qz_1), \ldots, \mathcal{N}(Qz_n)]^{-T} = [Q\mathcal{N}(z_1), \ldots, Q\mathcal{N}(z_n)]^{-T} = \left(Q \cdot [\mathcal{N}(z_1), \ldots, \mathcal{N}(z_n)]\right)^{-T} = \left([\mathcal{N}(z_1), \ldots, \mathcal{N}(z_n)]^{-1} \cdot Q^T\right)^T = Q \cdot [\mathcal{N}(z_1), \ldots, \mathcal{N}(z_n)]^{-T}$$

for any orthogonal matrix $Q$. Therefore, we get

$$\frac{1}{\delta(Qz_1, \ldots, Qz_n)} = \max \{\|x\| : x \text{ is column of } [\mathcal{N}(Qz_1), \ldots, \mathcal{N}(Qz_n)]^{-T}\} = \max \{\|Qy\| : y \text{ is column of } [\mathcal{N}(z_1), \ldots, \mathcal{N}(z_n)]^{-T}\} = \max \{\|y\| : y \text{ is column of } [\mathcal{N}(z_1), \ldots, \mathcal{N}(z_n)]^{-T}\} = \frac{1}{\delta(z_1, \ldots, z_n)}.$$

For Claim 3 let $y_1$ and $y_2$ be two neighboring vertices of $P$. Then, there are exactly $n - 1$ indices $j$ for which $a_j^T \cdot (y_2 - y_1) = 0$. We denote them by $j_1, \ldots, j_{n-1}$. If there is an index $i$ for which $a_i^T \cdot (y_2 - y_1) \ne 0$, then $a_{j_1}, \ldots, a_{j_{n-1}}, a_i$ are linearly independent. Consequently, $\delta(a_{j_1}, \ldots, a_{j_{n-1}}, a_i) \ge \delta$. Let us assume that $a_i^T \cdot (y_2 - y_1) \ge 0$. (Otherwise, consider $a_i^T \cdot (y_1 - y_2)$ instead.) Since $y_2 - y_1$ is a normal vector of $\mathrm{span}\{a_{j_1}, \ldots, a_{j_{n-1}}\}$ that lies in the same halfspace as $a_i$, we obtain

$$\frac{a_i^T \cdot (y_2 - y_1)}{\|y_2 - y_1\|} = \hat{\delta}(\{a_{j_1}, \ldots, a_{j_{n-1}}\}, a_i) \ge \delta(a_{j_1}, \ldots, a_{j_{n-1}}, a_i) \ge \delta$$

and, thus, $a_i^T \cdot (y_2 - y_1) \ge \delta \cdot \|y_2 - y_1\|$.

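For proving Claim 4 we can focus on showing the first inequality. The second one follows from $\Delta \ge \max\{\Delta_1, \Delta_{n-1}\}$. For this, it suffices to show that for $n$ arbitrary linearly independent rows $a_{i_1}^T, \ldots, a_{i_n}^T$ of $A$ the inequality

$$\frac{1}{\hat{\delta}(\{a_{i_1}, \ldots, a_{i_{n-1}}\}, a_{i_n})} \le n\Delta_1\Delta_{n-1}$$

holds. By previous observations we know that

$$\frac{1}{\hat{\delta}(\{a_{i_1}, \ldots, a_{i_{n-1}}\}, a_{i_n})} = \|x\|,$$

where $x$ is the unique solution of $\hat{A}x = e_n$ for $\hat{A} = [\mathcal{N}(a_{i_1}), \ldots, \mathcal{N}(a_{i_n})]^T$. Let $\tilde{A} = [a_{i_1}, \ldots, a_{i_n}]^T$. Then,

$$\|x\|^2 = \sum_{k=1}^n x_k^2 = \sum_{k=1}^n \left(\frac{\det(\hat{A}_{n,k})}{\det(\hat{A})}\right)^2 = \sum_{k=1}^n \left(\frac{\det(\tilde{A}_{n,k}) \cdot \prod_{j=1}^{n-1} \frac{1}{\|a_{i_j}\|}}{\det(\tilde{A}) \cdot \prod_{j=1}^{n} \frac{1}{\|a_{i_j}\|}}\right)^2 = \sum_{k=1}^n \left(\frac{\det(\tilde{A}_{n,k}) \cdot \|a_{i_n}\|}{\det(\tilde{A})}\right)^2 \le \sum_{k=1}^n \left(\Delta_{n-1} \cdot \sqrt{n}\,\Delta_1\right)^2 = n^2 \Delta_1^2 \Delta_{n-1}^2.$$

Some of the equations need further explanation: Due to Cramer's rule, we have $x_k = \frac{\det(\bar{A})}{\det(\hat{A})}$, where $\bar{A}$ is obtained from $\hat{A}$ by replacing the $k$-th column by the right-hand side $e_n$ of the equation $\hat{A}x = e_n$. Laplace's formula yields $|\det(\bar{A})| = |\det(\hat{A}_{n,k})|$. Hence, the second equation is true. For the third equation note that the $k$-th row of matrix $\hat{A}$ is the same as the $k$-th row of matrix $\tilde{A}$ up to a factor of $\frac{1}{\|a_{i_k}\|}$. The inequality follows from $|\det(\tilde{A}_{n,k})| \le \Delta_{n-1}$ since this is a sub-determinant of $A$ of size $n - 1$, from $\|a_{i_n}\| \le \sqrt{n} \cdot \|a_{i_n}\|_\infty \le \sqrt{n}\,\Delta_1$, since $\|a_{i_n}\|_\infty$ is a sub-determinant of $A$ of size 1, and from $|\det(\tilde{A})| \ge 1$ since $\tilde{A}$ is invertible and integral by assumption. Hence,

$$\frac{1}{\hat{\delta}(\{a_{i_1}, \ldots, a_{i_{n-1}}\}, a_{i_n})} = \|x\| \le n\Delta_1\Delta_{n-1}.$$
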

6 Analysis 

For the proof of Theorem 1 we assume that $\|a_i\| = 1$ for all $i \in [m]$. This entails no loss of generality since normalizing the rows of matrix $A$ (and scaling the right-hand side $b$ appropriately) changes neither the behavior of our algorithm nor the parameter $\delta = \delta(A)$.

For given linear functions $L_1$ and $L_2$, we denote by $\pi = \pi_{L_1, L_2}$ the function $\pi : \mathbb{R}^n \to \mathbb{R}^2$, given by $\pi(x) = (L_1(x), L_2(x))$. Note that $n$-dimensional vectors can be treated as linear functions. By $P' = P'_{L_1, L_2}$ we denote the projection $\pi(P)$ of polytope $P$ onto the Euclidean plane, and by $R = R_{L_1, L_2}$ we denote the path from $\pi(x_1)$ to $\pi(x_2)$ along the edges of polygon $P'$.

Our goal is to bound the expected number of edges of the path $R = R_{w_1, w_2}$, which is random since $w_1$ and $w_2$ depend on the realizations of the random vectors $\lambda$ and $\mu$. Each edge of $R$ corresponds to a slope in $(0, \infty)$. These slopes are pairwise distinct with probability one (see Lemma 8). Hence, the number of edges of $R$ equals the number of distinct slopes of $R$. In order to bound the expected number of distinct slopes we first restrict our attention to slopes in the interval $(0, 1]$.

Definition 6. For a real $\varepsilon > 0$ let $F_\varepsilon$ denote the event that there are three pairwise distinct vertices $z_1, z_2, z_3$ of $P$ such that $z_1$ and $z_3$ are neighbors of $z_2$ and such that

$$\left| \frac{w_2^T \cdot (z_2 - z_1)}{w_1^T \cdot (z_2 - z_1)} - \frac{w_2^T \cdot (z_3 - z_2)}{w_1^T \cdot (z_3 - z_2)} \right| \le \varepsilon.$$

Note that if event $F_\varepsilon$ does not occur, then all slopes of $R$ differ by more than $\varepsilon$. Particularly, all slopes are pairwise distinct. First of all we show that event $F_\varepsilon$ is very unlikely to occur if $\varepsilon$ is chosen sufficiently small.

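Lemma 7. The probability that there are two neighboring vertices $z_1, z_2$ of $P$ such that $|w_1^T \cdot (z_2 - z_1)| \le \varepsilon \cdot \|z_2 - z_1\|$ is bounded from above by $\frac{2m^n \varepsilon}{\delta}$.

Proof. Let $z_1$ and $z_2$ be two neighboring vertices of $P$. Let $\Delta_z = z_2 - z_1$. Because the claim we want to show is invariant under scaling, we can assume without loss of generality that $\|\Delta_z\| = 1$. There are $n - 1$ indices $i_1, \ldots, i_{n-1} \in [m]$ such that $a_{i_k}^T z_1 = b_{i_k} = a_{i_k}^T z_2$. Recall that $w_1 = -[u_1, \ldots, u_n] \cdot \lambda$, where $\lambda = (\lambda_1, \ldots, \lambda_n)$ is drawn uniformly at random from $(0, 1]^n$. There must be an index $i$ such that $a_{i_1}, \ldots, a_{i_{n-1}}, u_i$ are linearly independent. Hence, $\kappa := u_i^T \Delta_z \ne 0$ and, thus, $|\kappa| \ge \delta$ due to Lemma 5, Claim 3.

We apply the principle of deferred decisions and assume that all $\lambda_j$ for $j \ne i$ are already drawn. Then

$$w_1^T \Delta_z = -\sum_{j=1}^n \lambda_j \cdot u_j^T \Delta_z = \gamma - \lambda_i \cdot \kappa, \quad \text{where } \gamma := -\sum_{j \ne i} \lambda_j \cdot u_j^T \Delta_z.$$

Thus,

$$|w_1^T \Delta_z| \le \varepsilon \iff w_1^T \Delta_z \in [-\varepsilon, \varepsilon] \iff \lambda_i \cdot \kappa \in [\gamma - \varepsilon, \gamma + \varepsilon] \iff \lambda_i \in \left[\frac{\gamma}{\kappa} - \frac{\varepsilon}{|\kappa|}, \frac{\gamma}{\kappa} + \frac{\varepsilon}{|\kappa|}\right].$$

The probability for the latter event is bounded by the length of the interval, i.e., by $\frac{2\varepsilon}{|\kappa|} \le \frac{2\varepsilon}{\delta}$. Since we have to consider at most $\binom{m}{n-1} \le m^n$ pairs of neighbors $(z_1, z_2)$, applying a union bound yields the additional factor of $m^n$. □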



Lemma 8. The probability of event $F_\varepsilon$ tends to 0 for $\varepsilon \to 0$.

Proof. Let $z_1, z_2, z_3$ be pairwise distinct vertices of $P$ such that $z_1$ and $z_3$ are neighbors of $z_2$ and let $\Delta_z := z_2 - z_1$ and $\Delta'_z := z_3 - z_2$. We assume that $\|\Delta_z\| = \|\Delta'_z\| = 1$. This entails no loss of generality as the fractions in Definition 6 are invariant under scaling. Let $i_1, \ldots, i_{n-1} \in [m]$ be the $n - 1$ indices for which $a_{i_k}^T z_1 = b_{i_k} = a_{i_k}^T z_2$. The rows $a_{i_1}, \ldots, a_{i_{n-1}}$ are linearly independent because $P$ is non-degenerate. Since $z_1, z_2, z_3$ are distinct vertices of $P$ and since $z_1$ and $z_3$ are neighbors of $z_2$, there is exactly one index $i_\ell$ for which $a_{i_\ell}^T z_3 < b_{i_\ell}$, i.e., $a_{i_\ell}^T \Delta'_z \ne 0$. Otherwise, $z_1, z_2, z_3$ would be collinear, which would contradict the fact that they are distinct vertices of $P$. Without loss of generality assume that $\ell = n - 1$. Since $a_{i_k}^T \Delta_z = 0$ for each $k \in [n-1]$, the vectors $a_{i_1}, \ldots, a_{i_{n-1}}, \Delta_z$ are linearly independent.

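We apply the principle of deferred decisions and assume that $w_1$ is already fixed. Thus, $w_1^T \Delta_z$ and $w_1^T \Delta'_z$ are fixed as well. Moreover, we assume that $w_1^T \Delta_z \ne 0$ and $w_1^T \Delta'_z \ne 0$ since this happens almost surely due to Lemma 7. Now consider the matrix $M = [a_{i_1}, \ldots, a_{i_{n-2}}, \Delta_z, a_{i_{n-1}}]$ and the random vector $(Y_1, \ldots, Y_{n-1}, Z)^T = M^{-1} \cdot w_2 = M^{-1} \cdot [v_1, \ldots, v_n] \cdot \mu$. For fixed values $y_1, \ldots, y_{n-1}$ let us consider all realizations of $\mu$ for which $(Y_1, \ldots, Y_{n-1}) = (y_1, \ldots, y_{n-1})$. Then

$$w_2^T \Delta_z = \left(M \cdot (y_1, \ldots, y_{n-1}, Z)^T\right)^T \Delta_z = \sum_{k=1}^{n-2} y_k \cdot a_{i_k}^T \Delta_z + y_{n-1} \cdot \Delta_z^T \Delta_z + Z \cdot a_{i_{n-1}}^T \Delta_z = y_{n-1},$$

i.e., the value of $w_2^T \Delta_z$ does not depend on the outcome of $Z$ since $\Delta_z$ is orthogonal to all $a_{i_k}$. For $\Delta'_z$ we obtain

$$w_2^T \Delta'_z = \left(M \cdot (y_1, \ldots, y_{n-1}, Z)^T\right)^T \Delta'_z = \sum_{k=1}^{n-2} y_k \cdot a_{i_k}^T \Delta'_z + y_{n-1} \cdot \Delta_z^T \Delta'_z + Z \cdot a_{i_{n-1}}^T \Delta'_z = y_{n-1} \cdot \Delta_z^T \Delta'_z + Z \cdot a_{i_{n-1}}^T \Delta'_z$$

as $\Delta'_z$ is orthogonal to all $a_{i_k}$ except for $k = \ell = n - 1$. The chain of equivalences

$$\left| \frac{w_2^T \Delta_z}{w_1^T \Delta_z} - \frac{w_2^T \Delta'_z}{w_1^T \Delta'_z} \right| \le \varepsilon$$
$$\iff \frac{w_2^T \Delta'_z}{w_1^T \Delta'_z} \in \left[\frac{w_2^T \Delta_z}{w_1^T \Delta_z} - \varepsilon, \frac{w_2^T \Delta_z}{w_1^T \Delta_z} + \varepsilon\right]$$
$$\iff w_2^T \Delta'_z \in \left[\frac{w_2^T \Delta_z}{w_1^T \Delta_z} \cdot w_1^T \Delta'_z - \varepsilon \cdot |w_1^T \Delta'_z|, \frac{w_2^T \Delta_z}{w_1^T \Delta_z} \cdot w_1^T \Delta'_z + \varepsilon \cdot |w_1^T \Delta'_z|\right]$$
$$\iff Z \cdot a_{i_{n-1}}^T \Delta'_z \in \left[\frac{w_2^T \Delta_z}{w_1^T \Delta_z} \cdot w_1^T \Delta'_z - y_{n-1} \cdot \Delta_z^T \Delta'_z - \varepsilon \cdot |w_1^T \Delta'_z|, \frac{w_2^T \Delta_z}{w_1^T \Delta_z} \cdot w_1^T \Delta'_z - y_{n-1} \cdot \Delta_z^T \Delta'_z + \varepsilon \cdot |w_1^T \Delta'_z|\right]$$

implies that for event $F_\varepsilon$ to occur $Z$ must fall into an interval $I = I(y_1, \ldots, y_{n-1})$ of length $2\varepsilon \cdot \frac{|w_1^T \Delta'_z|}{|a_{i_{n-1}}^T \Delta'_z|}$. The probability of this is bounded from above by

$$\frac{2n \cdot 2\varepsilon \cdot \frac{|w_1^T \Delta'_z|}{|a_{i_{n-1}}^T \Delta'_z|}}{\delta(r_1, \ldots, r_n) \cdot \min_{k \in [n]} \|r_k\|} = \underbrace{\frac{4n \cdot |w_1^T \Delta'_z|}{\delta(r_1, \ldots, r_n) \cdot \min_{k \in [n]} \|r_k\| \cdot |a_{i_{n-1}}^T \Delta'_z|}}_{=: \gamma} \cdot \varepsilon,$$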
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Figure 2: Slopes of the vertices of R 
where [n, . . . ,r n ] = M^ 1 ■ [v\, . . . , v n ]. This is due to (Yi, . . . , Y n _i, Z) T — [n, . . . , r n ] • ^ and 



Theorem 15 Since the vectors ri, . . . ,r« are linearly independent, we have S(ri, . . . , r n ) > and 
min fee r„i ||rfc|| > 0. Furthermore, \aj _ A' z \ > since i n -i is the constraint which is not tight for z 3 , 



but for z%. Hence, 7 < 00, and thus Pr 



<£ 



for e ->• 0. 



rn in triples (zi, z 2 , Z3) we have to consider, the claim follows by applying 
a union bound. □ 
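The decomposition used in this proof can be checked on a small instance. The following sketch assumes a toy setting with $n = 3$ in which the vectors `a1`, `a2`, `dz`, and `dz_prime` are hypothetical and were chosen only so that the orthogonality relations of the proof hold. It illustrates that $w_2^T \Delta_z$ equals $y_{n-1}$ regardless of $Z$, while $w_2^T \Delta'_z$ depends linearly on $Z$.

```python
from fractions import Fraction as F


def dot(u, v):
    """Standard inner product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))


def combine(columns, coeffs):
    """Return sum_k coeffs[k] * columns[k], i.e. M * coeffs for M = [columns]."""
    n = len(columns[0])
    return [sum(c * col[j] for c, col in zip(coeffs, columns)) for j in range(n)]


# Hypothetical toy data for n = 3.
a1 = [F(0), F(0), F(1)]                # a_{i_1}: orthogonal to dz and dz_prime
a2 = [F(0), F(1), F(0)]                # a_{i_{n-1}}: orthogonal to dz only
dz = [F(1), F(0), F(0)]                # Delta_z, unit length
dz_prime = [F(3, 5), F(-4, 5), F(0)]   # Delta'_z, unit length


def slope_numerators(y1, y2, z):
    """Return (w2^T Delta_z, w2^T Delta'_z) for w2 = M * (y1, y2, z)^T."""
    w2 = combine([a1, dz, a2], [y1, y2, z])  # M = [a1, dz, a2]
    return dot(w2, dz), dot(w2, dz_prime)
```

Here the second component equals $y_2 \cdot \Delta_z^T \Delta'_z + Z \cdot a_2^T \Delta'_z$, which is the quantity written as $\kappa + Z \cdot a_{i_{n-1}}^T \Delta'_z$ in the proof.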

Let $p \neq \pi(x_2)$ be a vertex of $R$. We call the slope $s$ of the edge incident to $p$ to the right of $p$ the slope of $p$. As a convention, we set the slope of $\pi(x_2)$ to $0$, which is smaller than the slope of any other vertex $p$ of $R$.

Let $t \ge 0$ be an arbitrary real, let $\hat{p}$ be the right-most vertex of $R$ whose slope is larger than $t$, and let $p^*$ be the right neighbor of $\hat{p}$ (see Figure 2). Let $\hat{x}$ and $x^*$ be the neighboring vertices of $P$ with $\pi(\hat{x}) = \hat{p}$ and $\pi(x^*) = p^*$. Now let $i = i(x^*, \hat{x}) \in [m]$ be the index for which $a_i^T x^* = b_i$ and for which $\hat{x}$ is the (unique) neighbor $x$ of $x^*$ for which $a_i^T x < b_i$. This index is unique due to the non-degeneracy of the polytope $P$. For an arbitrary real $\gamma \ge 0$ we consider the vector $\tilde{w}_2 = w_2 + \gamma \cdot a_i$.
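The choice of $\hat{p}$ and $p^*$ can be illustrated with a short sketch. It assumes, as an illustration only, that the slopes of the vertices of $R$ are given as a list from left to right, that the last entry is the slope $0$ of $\pi(x_2)$, and that the slopes decrease along the path. The function names are hypothetical.

```python
def split_vertices(slopes, t):
    """Return (index of p_hat, index of p_star) for threshold t >= 0.

    slopes[j] is the slope of the j-th vertex of R from left to right;
    the last entry is 0 by convention. Returns None if no slope exceeds t.
    """
    larger = [j for j, s in enumerate(slopes) if s > t]
    if not larger:
        return None
    j_hat = max(larger)          # right-most vertex whose slope is larger than t
    return j_hat, j_hat + 1      # p_star is the right neighbor of p_hat


def leftmost_not_exceeding(slopes, t):
    """Index of the left-most vertex whose slope does not exceed t."""
    return min(j for j, s in enumerate(slopes) if s <= t)
```

For decreasing slopes, the right neighbor of $\hat{p}$ coincides with the left-most vertex whose slope does not exceed $t$, which is the description of $p^*$ used in the statements that follow.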

Lemma 9. Let $\tilde{\pi} := \pi_{w_1, \tilde{w}_2}$ and let $\tilde{R} = R_{w_1, \tilde{w}_2}$ be the path from $\tilde{\pi}(x_1)$ to $\tilde{\pi}(x_2)$ in the projection $\tilde{P}' = P'_{w_1, \tilde{w}_2}$ of polytope $P$. Furthermore, let $\tilde{p}^*$ be the left-most vertex of $\tilde{R}$ whose slope does not exceed $t$. Then, $\tilde{p}^* = \tilde{\pi}(x^*)$.

Let us reformulate the statement of Lemma 9 as follows: The vertex $\tilde{p}^*$ is defined for the path $\tilde{R}$ of polygon $\tilde{P}'$ with the same rules as used to define the vertex $p^*$ of the original path $R$ of polygon $P'$. Even though $R$ and $\tilde{R}$ can be very different in shape, both vertices, $p^*$ and $\tilde{p}^*$, correspond to the same solution $x^*$ in the polytope $P$, that is, $p^* = \pi(x^*)$ and $\tilde{p}^* = \tilde{\pi}(x^*)$. Let us remark that Lemma 9 is a significant generalization of Lemma 4.3 of [4].

Proof. We consider a linear auxiliary function $\bar{w}_2 : \mathbb{R}^n \to \mathbb{R}$, given by $\bar{w}_2(x) = \tilde{w}_2^T x - \gamma \cdot b_i$. The paths $\bar{R} = R_{w_1, \bar{w}_2}$ and $\tilde{R}$ are identical except for a shift by $-\gamma \cdot b_i$ in the second coordinate because for $\bar{\pi} = \pi_{w_1, \bar{w}_2}$ we obtain

$$
\bar{\pi}(x) = (w_1^T x,\ \tilde{w}_2^T x - \gamma \cdot b_i) = (w_1^T x,\ \tilde{w}_2^T x) - (0,\ \gamma \cdot b_i) = \tilde{\pi}(x) - (0,\ \gamma \cdot b_i)
$$

for all $x \in \mathbb{R}^n$. Consequently, the slopes of $\bar{R}$ and $\tilde{R}$ are exactly the same (see Figure 3a).

Let $x \in P$ be an arbitrary point from the polytope $P$. Then, $\tilde{w}_2^T x = w_2^T x + \gamma \cdot a_i^T x \le w_2^T x + \gamma \cdot b_i$. The inequality is due to $\gamma \ge 0$ and $a_i^T x \le b_i$ for all $x \in P$. Equality holds, among others, for $x = x^*$ due to the choice of $a_i$. Hence, for all points $x \in P$ the two-dimensional points $\pi(x)$ and $\bar{\pi}(x)$ agree in the first coordinate while the second coordinate of $\pi(x)$ is at least the second coordinate of $\bar{\pi}(x)$ as $\bar{w}_2(x) = \tilde{w}_2^T x - \gamma \cdot b_i \le w_2^T x$. Additionally, we have $\pi(x^*) = \bar{\pi}(x^*)$. Thus, path $\bar{R}$ is below path $R$ but they meet at point $p^* = \pi(x^*)$. Hence, the slope of $\bar{R}$ to the left (right) of $p^*$ is at least (at most) the slope of $R$ to the left (right) of $p^*$, which is greater than (at most) $t$ (see Figure 3b). Consequently, $p^*$ is the left-most vertex of $\bar{R}$ whose slope does not exceed $t$. Since $\bar{R}$ and $\tilde{R}$ are identical up to a shift of $-(0, \gamma \cdot b_i)$, $\tilde{\pi}(x^*)$ is the left-most vertex of $\tilde{R}$ whose slope does not exceed $t$, i.e., $\tilde{\pi}(x^*) = \tilde{p}^*$. □

Figure 3: Relations between $R$, $\tilde{R}$, and $\bar{R}$. (a) Relation between $\bar{R}$ and $\tilde{R}$. (b) Relation between $\bar{R}$ and $R$.
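The two facts used in this proof, the constant shift between the projections and the fact that the auxiliary objective never exceeds the original one on $P$, can be checked numerically. The sketch below is an illustration under assumptions: the function name is hypothetical, and any polytope, constraint $a_i^T x \le b_i$, and $\gamma \ge 0$ may be plugged in.

```python
def dot(u, v):
    """Standard inner product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))


def compare_objectives(w2, a_i, b_i, gamma, x):
    """Return (w2^T x, tilde_w2^T x, bar_w2(x)) for tilde_w2 = w2 + gamma * a_i.

    bar_w2(x) = tilde_w2^T x - gamma * b_i is the auxiliary function of the proof.
    """
    w2_tilde = [w + gamma * a for w, a in zip(w2, a_i)]
    tilde_value = dot(w2_tilde, x)
    bar_value = tilde_value - gamma * b_i
    return dot(w2, x), tilde_value, bar_value
```

For every feasible point the third value is at most the first, with equality exactly when the constraint $a_i^T x \le b_i$ is tight, and the second and third values always differ by $\gamma \cdot b_i$.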



Lemma 9 holds for any vector $\tilde{w}_2$ on the ray $\vec{r} = \{ w_2 + \gamma \cdot a_i : \gamma \ge 0 \}$. As $\|w_2\| \le n$ (see Section 2.2), we have $w_2 \in [-n, n]^n$. Hence, ray $\vec{r}$ intersects the boundary of $[-n, n]^n$ in a unique point $z$. We choose $\tilde{w}_2 = \tilde{w}_2(w_2, i) := z$ and obtain the following result.

Corollary 10. Let $\tilde{\pi} = \pi_{w_1, \tilde{w}_2(w_2, i)}$ and let $\tilde{p}^*$ be the left-most vertex of path $\tilde{R} = R_{w_1, \tilde{w}_2(w_2, i)}$ whose slope does not exceed $t$. Then, $\tilde{p}^* = \tilde{\pi}(x^*)$.

Note that Corollary 10 only holds for the right choice of index $i = i(x^*, \hat{x})$. The vector $\tilde{w}_2(w_2, i)$ is defined for any vector $w_2 \in [-n, n]^n$ and any index $i \in [m]$. In the remainder, index $i$ is an arbitrary index from $[m]$.
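The point $z$ in which the ray leaves the box $[-n, n]^n$ can be computed coordinate by coordinate. The following sketch is one possible way to do this; the function name and the parameter `bound` (standing for $n$) are assumptions made for illustration.

```python
from fractions import Fraction as F


def ray_exit_point(w2, a_i, bound):
    """Return (z, gamma) where z = w2 + gamma * a_i is the point at which the
    ray {w2 + gamma * a_i : gamma >= 0} leaves the box [-bound, bound]^n.

    Assumes w2 lies in the box and a_i is not the zero vector.
    """
    candidates = []
    for w, a in zip(w2, a_i):
        if a > 0:
            candidates.append((bound - w) / a)    # hits the upper face
        elif a < 0:
            candidates.append((-bound - w) / a)   # hits the lower face
    if not candidates:
        raise ValueError("a_i must not be the zero vector")
    gamma = min(candidates)
    return [w + gamma * a for w, a in zip(w2, a_i)], gamma
```

Starting the ray from two different points of the same line in direction $a_i$ inside the box yields the same exit point, which is the property exploited in the definition of $\tilde{w}_2(w_2, i)$.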

We can now define the following event that is parameterized in $i$, $t$, and a real $\varepsilon > 0$ and that depends on $w_1$ and $w_2$.

Definition 11. For an index $i \in [m]$ and a real $t \ge 0$ let $\tilde{p}^*$ be the left-most vertex of $\tilde{R} = R_{w_1, \tilde{w}_2(w_2, i)}$ whose slope does not exceed $t$ and let $y^*$ be the corresponding vertex of $P$. For a real $\varepsilon > 0$ we denote by $E_{i,t,\varepsilon}$ the event that the conditions

• $a_i^T y^* = b_i$ and

• $\frac{w_2^T (\hat{y} - y^*)}{w_1^T (\hat{y} - y^*)} \in (t, t + \varepsilon]$, where $\hat{y}$ is the neighbor $y$ of $y^*$ for which $a_i^T y < b_i$,

are met. Note that the vertex $\hat{y}$ always exists and that it is unique since the polytope $P$ is non-degenerate.

Let us remark that the vertices $y^*$ and $\hat{y}$, which depend on the index $i$, equal $x^*$ and $\hat{x}$ if we choose $i = i(x^*, \hat{x})$. For other choices of $i$, this is, in general, not the case.

Observe that all possible realizations of $w_2$ from the line $L := \{ w_2 + x \cdot a_i : x \in \mathbb{R} \}$ are mapped to the same vector $\tilde{w}_2(w_2, i)$. Consequently, if $w_1$ is fixed and if we only consider realizations of $\mu$ for which $w_2 \in L$, then vertex $\tilde{p}^*$ and, hence, vertex $y^*$ from Definition 11 are already determined. However, since $w_2$ is not completely specified, we have some randomness left for event $E_{i,t,\varepsilon}$ to occur. This allows us to bound the probability of event $E_{i,t,\varepsilon}$ from above (see proof of Lemma 13). The next lemma shows why this probability matters.

Lemma 12. For reals $t \ge 0$ and $\varepsilon > 0$ let $A_{t,\varepsilon}$ denote the event that the path $R = R_{w_1, w_2}$ has a slope in $(t, t + \varepsilon]$. Then, $A_{t,\varepsilon} \subseteq \bigcup_{i=1}^{m} E_{i,t,\varepsilon}$.

Proof. Assume that event $A_{t,\varepsilon}$ occurs. Let $\hat{p}$ be the right-most vertex of $R$ whose slope exceeds $t$, let $p^*$ be the right neighbor of $\hat{p}$, and let $\hat{x}$ and $x^*$ be the neighboring vertices of $P$ for which $\pi(\hat{x}) = \hat{p}$ and $\pi(x^*) = p^*$, where $\pi = \pi_{w_1, w_2}$. Moreover, let $i = i(x^*, \hat{x})$ be the index for which $a_i^T x^* = b_i$ but $a_i^T \hat{x} < b_i$. We show that event $E_{i,t,\varepsilon}$ occurs.

Consider the left-most vertex $\tilde{p}^*$ of $\tilde{R} = R_{w_1, \tilde{w}_2(w_2, i)}$ whose slope does not exceed $t$ and let $y^*$ be the corresponding vertex of $P$. In accordance with Corollary 10 we obtain $y^* = x^*$. Hence, $a_i^T y^* = b_i$, i.e., the first condition of event $E_{i,t,\varepsilon}$ holds. Now let $\hat{y}$ be the unique neighbor $y$ of $y^*$ for which $a_i^T y < b_i$. Since $y^* = x^*$, we obtain $\hat{y} = \hat{x}$. Consequently,

$$
\frac{w_2^T (\hat{y} - y^*)}{w_1^T (\hat{y} - y^*)} = \frac{w_2^T (\hat{x} - x^*)}{w_1^T (\hat{x} - x^*)} \in (t, t + \varepsilon]
$$

since this is the smallest slope of $R$ that exceeds $t$ and since there is a slope in $(t, t + \varepsilon]$ by assumption. Hence, event $E_{i,t,\varepsilon}$ occurs since the second condition for event $E_{i,t,\varepsilon}$ to happen holds as well. □



With Lemma 12 we can now bound the probability of event $A_{t,\varepsilon}$.

Lemma 13. For reals $t \ge 0$ and $\varepsilon > 0$ the probability of event $A_{t,\varepsilon}$ is bounded by $\Pr[A_{t,\varepsilon}] \le \frac{4mn^2\varepsilon}{\delta^2}$.



Proof. Due to Lemma 12 it suffices to show that $\Pr[E_{i,t,\varepsilon}] \le \frac{1}{m} \cdot \frac{4mn^2\varepsilon}{\delta^2} = \frac{4n^2\varepsilon}{\delta^2}$ for any index $i \in [m]$.

We apply the principle of deferred decisions and assume that vector $\lambda \in (0, 1]^n$ is not random anymore, but arbitrarily fixed. Thus, vector $w_1$ is already fixed. Now we extend the normalized vector $a_i$ to an orthonormal basis $\{q_1, \ldots, q_{n-1}, a_i\}$ of $\mathbb{R}^n$ and consider the random vector $(Y_1, \ldots, Y_{n-1}, Z)^T = Q^T w_2$ given by the matrix vector product of the transpose of the orthogonal matrix $Q = [q_1, \ldots, q_{n-1}, a_i]$ and the vector $w_2 = [v_1, \ldots, v_n] \cdot \mu$. For fixed values $y_1, \ldots, y_{n-1}$ let us consider all realizations of $\mu$ such that $(Y_1, \ldots, Y_{n-1}) = (y_1, \ldots, y_{n-1})$. Then, $w_2$ is fixed up to the ray

$$
w_2(Z) = Q \cdot (y_1, \ldots, y_{n-1}, Z)^T = \sum_{j=1}^{n-1} y_j \cdot q_j + Z \cdot a_i = w + Z \cdot a_i
$$

for $w = \sum_{j=1}^{n-1} y_j \cdot q_j$. All realizations of $w_2(Z)$ that are under consideration are mapped to the same value $\tilde{w}_2$ by the function $w_2 \mapsto \tilde{w}_2(w_2, i)$, i.e., $\tilde{w}_2(w_2(Z), i) = \tilde{w}_2$ for any possible realization of $Z$. In other words, if $w_2 = w_2(Z)$ is specified up to this ray, then the path $R_{w_1, \tilde{w}_2(w_2, i)}$ and, hence, the vectors $y^*$ and $\hat{y}$ used for the definition of event $E_{i,t,\varepsilon}$, are already determined.

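Let us only consider the case that the first condition of event $E_{i,t,\varepsilon}$ is fulfilled. Otherwise, event $E_{i,t,\varepsilon}$ cannot occur. Thus, event $E_{i,t,\varepsilon}$ occurs iff

$$
(t, t + \varepsilon] \ni \frac{w_2^T \cdot (\hat{y} - y^*)}{w_1^T \cdot (\hat{y} - y^*)} = \underbrace{\frac{w^T \cdot (\hat{y} - y^*)}{w_1^T \cdot (\hat{y} - y^*)}}_{=: \alpha} + Z \cdot \underbrace{\frac{a_i^T \cdot (\hat{y} - y^*)}{w_1^T \cdot (\hat{y} - y^*)}}_{=: \beta}.
$$

The next step in this proof will be to show that the inequality $|\beta| \ge \frac{\delta}{n}$ is necessary for event $E_{i,t,\varepsilon}$ to happen. For the sake of simplicity let us assume that $\|\hat{y} - y^*\| = 1$ since $\beta$ is invariant under scaling. If event $E_{i,t,\varepsilon}$ occurs, then $a_i^T y^* = b_i$, $\hat{y}$ is a neighbor of $y^*$, and $a_i^T \hat{y} \neq b_i$. That is, by Lemma 5, Claim 3 we obtain $|a_i^T \cdot (\hat{y} - y^*)| \ge \delta \cdot \|\hat{y} - y^*\| = \delta$ and, hence,

$$
|\beta| = \left| \frac{a_i^T \cdot (\hat{y} - y^*)}{w_1^T \cdot (\hat{y} - y^*)} \right| \ge \frac{\delta}{|w_1^T \cdot (\hat{y} - y^*)|} \ge \frac{\delta}{\|w_1\| \cdot \|\hat{y} - y^*\|} \ge \frac{\delta}{n}.
$$

Summarizing the previous observations we can state that if event $E_{i,t,\varepsilon}$ occurs, then $|\beta| \ge \frac{\delta}{n}$ and $\alpha + Z \cdot \beta \in (t, t + \varepsilon] \subseteq [t - \varepsilon, t + \varepsilon]$. Hence,

$$
Z \in \left[ \frac{t - \alpha}{\beta} - \frac{\varepsilon}{|\beta|},\ \frac{t - \alpha}{\beta} + \frac{\varepsilon}{|\beta|} \right] \subseteq \left[ \frac{t - \alpha}{\beta} - \frac{n\varepsilon}{\delta},\ \frac{t - \alpha}{\beta} + \frac{n\varepsilon}{\delta} \right] =: I(y_1, \ldots, y_{n-1}).
$$

Let $B_{i,t,\varepsilon}$ denote the event that $Z$ falls into the interval $I(Y_1, \ldots, Y_{n-1})$ of length $\frac{2n\varepsilon}{\delta}$. We showed that $E_{i,t,\varepsilon} \subseteq B_{i,t,\varepsilon}$. Consequently,

$$
\Pr[E_{i,t,\varepsilon}] \le \Pr[B_{i,t,\varepsilon}] \le \frac{2n \cdot \frac{2n\varepsilon}{\delta}}{\delta(Q^T v_1, \ldots, Q^T v_n)} \le \frac{4n^2\varepsilon}{\delta^2},
$$

where the second inequality is due to the first claim of Theorem 15. By definition, we have

$$
(Y_1, \ldots, Y_{n-1}, Z)^T = Q^T w_2 = Q^T \cdot [v_1, \ldots, v_n] \cdot \mu = [Q^T v_1, \ldots, Q^T v_n] \cdot \mu.
$$

The third inequality stems from the fact that $\delta(Q^T v_1, \ldots, Q^T v_n) = \delta(v_1, \ldots, v_n) \ge \delta$, where the equality is due to the orthogonality of $Q$ (Claim 2 of Lemma 5). □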
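The interval argument of this proof can be made concrete. The sketch below computes the interval $I$ of length $\frac{2n\varepsilon}{\delta}$ around $\frac{t - \alpha}{\beta}$ from given values; the function name and the sample values used with it are hypothetical and serve only as an illustration.

```python
from fractions import Fraction as F


def z_interval(alpha, beta, t, eps, delta, n):
    """Return (lo, hi), the interval that must contain Z whenever
    alpha + Z * beta lies in (t, t + eps] and |beta| >= delta / n."""
    if abs(beta) < F(delta) / n:
        raise ValueError("the argument requires |beta| >= delta / n")
    center = (F(t) - alpha) / beta
    half_width = F(n) * eps / delta
    return center - half_width, center + half_width
```

Every value of $Z$ that places the slope $\alpha + Z \cdot \beta$ in $(t, t + \varepsilon]$ lies in the returned interval, which mirrors the inclusion $E_{i,t,\varepsilon} \subseteq B_{i,t,\varepsilon}$.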

Lemma 14. Let $Y$ be the number of slopes of $R = R_{w_1, w_2}$ that lie in the interval $(0, 1]$. Then, $E[Y] \le \frac{4mn^2}{\delta^2}$.

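Proof. For a real $\varepsilon > 0$ let $F_\varepsilon$ denote the event from Definition 6. Recall that all slopes of $R$ differ by more than $\varepsilon$ if $F_\varepsilon$ does not occur. Let $Z_{t,\varepsilon}$ be the random variable that indicates whether $R$ has a slope in the interval $(t, t + \varepsilon]$ or not, i.e., $Z_{t,\varepsilon} = 1$ if there is such a slope and $Z_{t,\varepsilon} = 0$ otherwise. Then, for any integer $k \ge 1$

$$
Y \le \begin{cases} \sum_{i=0}^{k-1} Z_{\frac{i}{k}, \frac{1}{k}} & \text{if } F_{\frac{1}{k}} \text{ does not occur}, \\ m^n & \text{otherwise}. \end{cases}
$$

This is true since $\binom{m}{n-1} \le m^n$ is a worst-case bound on the number of edges of $P$ and, hence, of the number of slopes of $R$. Consequently,

$$
\begin{aligned}
E[Y] &\le \sum_{i=0}^{k-1} E\left[ Z_{\frac{i}{k}, \frac{1}{k}} \right] + \Pr\left[ F_{\frac{1}{k}} \right] \cdot m^n
= \sum_{i=0}^{k-1} \Pr\left[ A_{\frac{i}{k}, \frac{1}{k}} \right] + \Pr\left[ F_{\frac{1}{k}} \right] \cdot m^n \\
&\le \sum_{i=0}^{k-1} \frac{4mn^2 \cdot \frac{1}{k}}{\delta^2} + \Pr\left[ F_{\frac{1}{k}} \right] \cdot m^n
= \frac{4mn^2}{\delta^2} + \Pr\left[ F_{\frac{1}{k}} \right] \cdot m^n,
\end{aligned}
$$

where the second inequality stems from Lemma 13. The claim follows since the bound on $E[Y]$ holds for any integer $k \ge 1$ and since $\Pr[F_\varepsilon] \to 0$ for $\varepsilon \to 0$ in accordance with Lemma 8. □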
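The discretization in this proof can be illustrated as follows. The sketch assumes a hypothetical list of slopes and compares the exact number of slopes in $(0, 1]$ with the sum of the indicators $Z_{i/k, 1/k}$; the function names are chosen for illustration.

```python
from fractions import Fraction as F


def count_in_unit_interval(slopes):
    """Y: the number of slopes that lie in (0, 1]."""
    return sum(1 for s in slopes if 0 < s <= 1)


def count_via_indicators(slopes, k):
    """Sum over i = 0, ..., k-1 of the indicator that some slope lies in
    the interval (i/k, (i+1)/k]."""
    total = 0
    for i in range(k):
        lo, hi = F(i, k), F(i + 1, k)
        total += any(lo < s <= hi for s in slopes)
    return total
```

If all slopes differ by more than $\frac{1}{k}$, each interval contains at most one slope and both counts agree. For coarser grids the indicator sum can undercount, which is why the proof falls back to the worst-case bound $m^n$ when $F_{1/k}$ occurs.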

Proof of Theorem 1. Lemma 14 bounds only the expected number of edges on the path $R$ that have a slope in the interval $(0, 1]$. However, the lemma can also be used to bound the expected number of edges whose slope is larger than $1$. For this, one only needs to exchange the order of the objective functions $w_1^T x$ and $w_2^T x$ in the projection $\pi$. Then any edge with a slope of $s > 0$ becomes an edge with slope $\frac{1}{s}$. Due to the symmetry in the choice of $w_1$ and $w_2$, Lemma 14 can also be applied to bound the expected number of edges whose slope lies in $(0, 1]$ for this modified projection, which are exactly the edges whose original slope lies in $[1, \infty)$.

Formally we can argue as follows. Consider the vertices $x'_1 = x_2$ and $x'_2 = x_1$, the directions $w'_1 = -w_2$ and $w'_2 = -w_1$, and the projection $\pi' = \pi_{w'_1, w'_2}$, yielding a path $R'$ from $\pi'(x'_1)$ to $\pi'(x'_2)$. Let $X$ be the number of slopes of $R$ and let $Y$ and $Y'$ be the number of slopes of $R$ and of $R'$, respectively, that lie in the interval $(0, 1]$. The paths $R$ and $R'$ are identical except for the linear transformation

$$
\begin{pmatrix} x \\ y \end{pmatrix} \mapsto \begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix} \cdot \begin{pmatrix} x \\ y \end{pmatrix}.
$$

Consequently, $s$ is a slope of $R$ if and only if $\frac{1}{s}$ is a slope of $R'$ and, hence, $X \le Y + Y'$. One might expect equality here, but in the unlikely case that $R$ contains an edge with slope equal to $1$ we have $X = Y + Y' - 1$. The expectation of $Y$ is bounded by Lemma 14. Since this result holds for any two vertices $x_1$ and $x_2$, it also holds for $x'_1$ and $x'_2$. Note that $w'_1$ and $w'_2$ have exactly the same distribution as the directions the shadow vertex algorithm computes for $x'_1$ and $x'_2$. Therefore, Lemma 14 can also be applied to bound $E[Y']$ and we obtain

$$
E[X] \le E[Y] + E[Y'] \le \frac{8mn^2}{\delta^2}. \qquad \Box
$$
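The effect of the linear transformation on slopes can be verified directly. The following sketch uses hypothetical helper names and checks on sample points that an edge of slope $s$ is mapped to an edge of slope $\frac{1}{s}$.

```python
from fractions import Fraction as F


def transform(point):
    """Apply (x, y) -> (-y, -x), i.e. multiplication by [[0, -1], [-1, 0]]."""
    x, y = point
    return (-y, -x)


def slope(p, q):
    """Slope of the segment from p to q (requires different first coordinates)."""
    return F(q[1] - p[1]) / F(q[0] - p[0])
```

An edge with slope $s \in [1, \infty)$ therefore corresponds to an edge with slope in $(0, 1]$ after the transformation, which is what allows Lemma 14 to be applied a second time.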

The proof of Corollary 2 follows immediately from Theorem 1 and Claim 4 of Lemma 5.

7 Some Probability Theory 

The following theorem is a variant of Theorem 35 from [5]. The two differences are as follows: In [5] arbitrary densities are considered. We only consider uniform distributions. On the other hand, instead of considering matrices with entries from $\{-1, 0, 1\}$ we consider real-valued square matrices. This is why the results from [5] cannot be applied directly.

Theorem 15. Let $X_1, \ldots, X_n$ be independent random variables uniformly distributed on $(0, 1]$, let $A = [a_1, \ldots, a_n] \in \mathbb{R}^{n \times n}$ be an invertible matrix, let $(Y_1, \ldots, Y_{n-1}, Z)^T = A \cdot (X_1, \ldots, X_n)^T$ be the linear combinations of $X_1, \ldots, X_n$ given by $A$, and let $I : \mathbb{R}^{n-1} \to \{ [x, x + \varepsilon] : x \in \mathbb{R} \}$ be a function mapping a tuple $(y_1, \ldots, y_{n-1})$ to an interval $I(y_1, \ldots, y_{n-1})$ of length $\varepsilon$. Then the probability that $Z$ lies in the interval $I(Y_1, \ldots, Y_{n-1})$ can be bounded by

$$
\Pr[Z \in I(Y_1, \ldots, Y_{n-1})] \le \frac{2n\varepsilon}{\delta(a_1, \ldots, a_n) \cdot \min_{k \in [n]} \|a_k\|}.
$$
Proof. Let us consider the proof of Theorem 35 of [5] for $m = n$ and $k = 1$. We obtain

$$
\Pr[Z \in I(Y_1, \ldots, Y_{n-1})] \le \varepsilon \cdot |\det(A^{-1})| \cdot \int_{y \in \mathbb{R}^{n-1}} \max_{z \in \mathbb{R}} f_X\bigl(A^{-1} \cdot (y, z)^T\bigr) \, dy,
$$

where $f_X$ denotes the common density of the variables $X_1, \ldots, X_n$. In our case, $f_X$ is $1$ on $(0, 1]^n$ and $0$ otherwise. Note that in the proof of Theorem 35 matrix $A$ was an integer matrix and so $|\det(A^{-1})| \le 1$. In this proof considering this factor is crucial.

It remains to bound $\int_{y \in \mathbb{R}^{n-1}} \max_{z \in \mathbb{R}} f_X\bigl(A^{-1} \cdot (y, z)^T\bigr) \, dy$. For this we only have to consider the proof of Lemma 36 of [5] since all densities are rectangular functions. Here, we have $\chi = 1$ and $\ell_i = 1$ and $\phi_i = 1$ for any $i \in [n]$. The only point where the structure of matrix $A$ is exploited is where $|\det(P \cdot A \cdot T)|$ for $P = [I_{n-1}, O_{(n-1) \times 1}]$ and $T = [e_1, \ldots, e_{i-1}, e_{i+1}, \ldots, e_n]$ for an arbitrary index $i \in [n]$ is bounded. Since $PAT = A_{n,i}$, we obtain

$$
\int_{y \in \mathbb{R}^{n-1}} \max_{z \in \mathbb{R}} f_X\bigl(A^{-1} \cdot (y, z)^T\bigr) \, dy \le 2 \cdot \sum_{i \in [n]} |\det(A_{n,i})|.
$$

Summarizing both bounds, we obtain

$$
\Pr[Z \in I(Y_1, \ldots, Y_{n-1})] \le 2\varepsilon \cdot |\det(A^{-1})| \cdot \sum_{i \in [n]} |\det(A_{n,i})| = 2\varepsilon \cdot \sum_{i \in [n]} \frac{|\det(A_{n,i})|}{|\det(A)|}.
$$

We are now going to bound the fraction $\frac{|\det(A_{n,i})|}{|\det(A)|}$. For this, consider the equation $Ax = e_n$. We obtain

$$
|x_i| = \frac{|\det([a_1, \ldots, a_{i-1}, e_n, a_{i+1}, \ldots, a_n])|}{|\det(A)|} = \frac{|\det(A_{n,i})|}{|\det(A)|},
$$

where the first equality is due to Cramer's rule and the second equality is due to Laplace's formula. Hence,

$$
\Pr[Z \in I(Y_1, \ldots, Y_{n-1})] \le 2\varepsilon \cdot \sum_{i \in [n]} \frac{|\det(A_{n,i})|}{|\det(A)|} = 2\varepsilon \cdot \|x\|_1 \le 2\sqrt{n}\varepsilon \cdot \|x\|.
$$

Now consider the equation $\hat{A} \hat{x} = e_n$ for $\hat{A} = [\mathcal{N}(a_1), \ldots, \mathcal{N}(a_n)]$. Vector $\hat{x} = \hat{A}^{-1} e_n$ is the $n$-th column of the matrix $\hat{A}^{-1}$. Thus, we obtain

$$
\|\hat{x}\| \le \max_{r \text{ column of } \hat{A}^{-1}} \|r\| \le \frac{\sqrt{n}}{\delta(a_1, \ldots, a_n)},
$$

where the second inequality is due to Claim 1 of Lemma 5. Due to $A = \hat{A} \cdot \operatorname{diag}(\|a_1\|, \ldots, \|a_n\|)$, we have

$$
x = A^{-1} e_n = \operatorname{diag}\left( \frac{1}{\|a_1\|}, \ldots, \frac{1}{\|a_n\|} \right) \cdot \hat{A}^{-1} e_n = \operatorname{diag}\left( \frac{1}{\|a_1\|}, \ldots, \frac{1}{\|a_n\|} \right) \cdot \hat{x}.
$$

Consequently, $\|x\| \le \frac{\|\hat{x}\|}{\min_{k \in [n]} \|a_k\|}$ and, thus,

$$
\Pr[Z \in I(Y_1, \ldots, Y_{n-1})] \le 2\sqrt{n}\varepsilon \cdot \frac{\|\hat{x}\|}{\min_{k \in [n]} \|a_k\|} \le \frac{2n\varepsilon}{\delta(a_1, \ldots, a_n) \cdot \min_{k \in [n]} \|a_k\|}. \qquad \Box
$$
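The identity $|x_i| = \frac{|\det(A_{n,i})|}{|\det(A)|}$ for the solution of $Ax = e_n$ can be checked on a small integer matrix. The sketch below is an illustration with hypothetical function names; it uses Laplace expansion for the determinant, which is adequate only for small $n$, and 0-based column indices.

```python
from fractions import Fraction as F


def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def minor_last_row(a, i):
    """A_{n,i}: the matrix A without its last row and without column i."""
    return [row[:i] + row[i + 1:] for row in a[:-1]]


def solve_en_by_cramer(a):
    """Solve A x = e_n by Cramer's rule for an invertible integer matrix A."""
    n = len(a)
    d = det(a)
    e_n = [0] * (n - 1) + [1]
    x = []
    for i in range(n):
        # Replace column i of A by e_n.
        replaced = [row[:i] + [e_n[r]] + row[i + 1:] for r, row in enumerate(a)]
        x.append(F(det(replaced), d))
    return x
```

Summing the absolute values of the entries of `x` gives $\|x\|_1 = \sum_{i \in [n]} \frac{|\det(A_{n,i})|}{|\det(A)|}$, the quantity that is bounded in the proof.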
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